The sampling brain
Understanding the algorithmic nature of mental processes is of vital importance to psychology, neuroscience, and artificial intelligence. In response to a rapidly changing world and computationally demanding cognitive tasks, evolution may have endowed us with brains that approximate rational solutions, such that our performance is close to optimal. This thesis proposes one such approximation algorithm, sample-based approximation, as a strategy the brain implements to tackle complex cognitive tasks. Knowing that certain types of sampling are used to generate mental samples, the brain could also actively correct for the uncertainty that comes along with the sampling process. This correction process leaves traces in human probability estimates, suggesting a more rational account of sample-based estimation. In addition, these mental samples can come from both observed experiences (memory) and synthesised experiences (imagination). Each source of mental samples plays a unique role in learning tasks, and the classical error-correction principle of learning can be generalised when mental-sampling processes are considered.
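The abstract's idea of correcting sample-based estimates for sampling uncertainty can be made concrete with a small sketch. The thesis does not specify the correction mechanism; here Laplace's rule of succession (the posterior mean under a uniform prior) stands in as one illustrative way a few noisy samples could be regularised.

```python
import random

random.seed(0)

def raw_estimate(samples):
    """Naive sample-based probability estimate: the observed proportion."""
    return sum(samples) / len(samples)

def corrected_estimate(samples):
    """Estimate corrected for sampling noise via Laplace's rule of
    succession (posterior mean under a uniform prior) -- an illustrative
    stand-in for the thesis's correction process, not its actual model."""
    return (sum(samples) + 1) / (len(samples) + 2)

# True event probability, approximated from a handful of mental samples.
p_true = 0.9
samples = [1 if random.random() < p_true else 0 for _ in range(5)]

print(raw_estimate(samples))        # can hit the extremes 0.0 or 1.0
print(corrected_estimate(samples))  # pulled toward 0.5 for small n
```

The corrected estimate never reaches 0 or 1 from finite data, which is the kind of systematic deviation from raw proportions that could leave detectable traces in human probability judgments.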
Probabilistic biases meet the Bayesian brain
Bayesian cognitive science sees the mind as a spectacular probabilistic inference machine. But Judgment and Decision Making (JDM) research has spent half a century uncovering how dramatically and systematically people depart from rational norms. This paper outlines recent research that opens up the possibility of an unexpected reconciliation. The key hypothesis is that the brain neither represents nor calculates with probabilities, but approximates probabilistic calculations by drawing samples from memory or mental simulation. Sampling models diverge from perfect probabilistic calculations in ways that capture many classic JDM findings, and offer the hope of an integrated explanation of classic heuristics and biases, including availability, representativeness, and anchoring and adjustment.
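A minimal sketch of the hypothesis above: approximating a probability by drawing a few samples, rather than computing it exactly, yields systematic deviations. The sample sizes and the rare-event setup here are illustrative assumptions, not parameters from the paper.

```python
import random

random.seed(1)

def sample_based_probability(p, n_samples):
    """Approximate an event's probability by drawing a few samples
    (standing in for memory retrieval or mental simulation),
    instead of computing it exactly."""
    return sum(random.random() < p for _ in range(n_samples)) / n_samples

# With only a handful of samples, a rare event (p = 0.05) is usually
# never sampled at all, so its estimated probability is often exactly 0 --
# one way small-sample approximation yields systematic judgment biases.
estimates = [sample_based_probability(0.05, 5) for _ in range(1000)]
print(sum(e == 0.0 for e in estimates) / 1000)  # roughly (0.95)**5, about 0.77
```

The point is not any single estimate but the pattern across many: the approximation error is structured, which is what lets sampling models predict classic biases rather than just noise.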
Information seeking as chasing anticipated prediction errors
When faced with delayed, uncertain rewards, humans and other animals usually prefer to know the eventual outcomes in advance. This preference for cues providing advance information can lead to seemingly suboptimal choices, where less reward is preferred over more reward. Here, we introduce a reinforcement-learning model of this behavior, the anticipated prediction error (APE) model, based on the idea that prediction errors themselves can be rewarding. As a result, animals will sometimes pick options that yield large prediction errors, even when the expected rewards are smaller. We compare the APE model against an alternative information-bonus model, where information itself is viewed as rewarding. These models are evaluated against a newly collected dataset with human participants. The APE model fits the data as well as or better than the other models, with fewer free parameters, thus providing a more robust and parsimonious account of the suboptimal choices. These results suggest that anticipated prediction errors can be an important signal underpinning decision making.
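The core mechanism described above can be sketched schematically: value an option by its expected reward plus a bonus proportional to the expected magnitude of the prediction error at outcome reveal. The additive weighting used here is an illustrative assumption, not the paper's exact formulation.

```python
def option_value(outcomes, w_ape):
    """Schematic value of a gamble under an anticipated-prediction-error
    account: expected reward plus a bonus proportional to the expected
    absolute prediction error when the outcome is revealed.
    `outcomes` is a list of (probability, reward) pairs."""
    expected = sum(p * r for p, r in outcomes)
    ape = sum(p * abs(r - expected) for p, r in outcomes)
    return expected + w_ape * ape

# An informative 50/50 gamble (big surprises) vs. a safe, slightly
# richer option (no surprise): with enough APE weight, the agent
# prefers the option with the smaller expected reward.
risky = [(0.5, 10.0), (0.5, 0.0)]   # expected 5, large prediction errors
safe  = [(1.0, 6.0)]                # expected 6, zero prediction error

print(option_value(risky, w_ape=0.0) < option_value(safe, w_ape=0.0))  # True
print(option_value(risky, w_ape=0.5) > option_value(safe, w_ape=0.5))  # True
```

With the APE weight at zero the agent is a standard expected-reward maximizer; a positive weight reproduces the seemingly suboptimal preference for surprising, informative options that the abstract describes.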
A Visualization method for machine translation evaluation results
PACLIC 20 / Wuhan, China / 1-3 November, 200
Dynamics of quantum entanglement in the reservoir with memory effects
The non-Markovian dynamics of quantum entanglement is studied via the Shabani-Lidar master equation when one of the entangled quantum systems is coupled to a local reservoir with memory effects. The completely positive reduced dynamical map can be constructed in the Kraus representation. Quantum entanglement decays more slowly in the non-Markovian environment, and the decoherence time for quantum entanglement can be markedly increased by changing the memory kernel. It is found that entanglement sudden death between the quantum systems and entanglement sudden birth between the system and the reservoir occur at different instants.
Comment: 14 pages, 3 figures
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
Zero-shot speaker cloning aims to synthesize speech for any target speaker unseen during TTS system building, given only a single speech reference of the speaker at hand. Although more practical in real applications, current zero-shot methods still produce speech with unsatisfactory naturalness and speaker similarity. Moreover, endowing the target speaker with arbitrary speaking styles in the zero-shot setup has not been considered, because the unique challenge of zero-shot speaker and style cloning is to learn disentangled speaker and style representations from only short references representing an arbitrary speaker and an arbitrary style. To address this challenge, we propose U-Style, which employs Grad-TTS as the backbone, cascading a speaker-specific encoder and a style-specific encoder between the text encoder and the diffusion decoder. Leveraging signal perturbation, U-Style is explicitly decomposed into speaker- and style-specific modeling parts, achieving better speaker and style disentanglement. To improve the modeling of unseen speakers and styles, these two encoders perform multi-level speaker and style modeling via skip-connected U-nets, incorporating representation extraction and information reconstruction. In addition, to improve the naturalness of synthetic speech, we adopt mean-based instance normalization and style-adaptive layer normalization in these encoders to perform representation extraction and condition adaptation, respectively. Experiments show that U-Style significantly surpasses state-of-the-art methods in unseen speaker cloning regarding naturalness and speaker similarity. Notably, U-Style can transfer the style of an unseen source speaker to another unseen target speaker, achieving flexible combinations of desired speaker timbre and style in zero-shot voice cloning.
DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech -- A Study between English and Mandarin
While the performance of cross-lingual TTS based on monolingual corpora has improved significantly in recent years, generated cross-lingual speech still suffers from the foreign accent problem, limiting its naturalness. Moreover, current cross-lingual methods ignore emotion modeling, even though emotion is indispensable paralinguistic information in speech delivery. In this paper, we propose DiCLET-TTS, a Diffusion model based Cross-Lingual Emotion Transfer method that can transfer emotion from a source speaker to intra- and cross-lingual target speakers. Specifically, to relieve the foreign accent problem while improving emotion expressiveness, the terminal distribution of the forward diffusion process is parameterized into a speaker-irrelevant but emotion-related linguistic prior by a prior text encoder conditioned on the emotion embedding. To address the weakened emotional expressiveness caused by speaker disentanglement in the emotion embedding, a novel orthogonal projection based emotion disentangling module (OP-EDM) is proposed to learn a speaker-irrelevant but emotion-discriminative embedding. Moreover, a condition-enhanced DPM decoder is introduced to strengthen the modeling of speaker and emotion in the reverse diffusion process, further improving emotion expressiveness in speech delivery. Cross-lingual emotion transfer experiments show the superiority of DiCLET-TTS over various competitive models and confirm the effectiveness of OP-EDM in learning a speaker-irrelevant but emotion-discriminative embedding.
Comment: accepted by TASL